03. Biases in Data Collection
Summary of the types of biases in Data Collection
Selection Bias
- Nonresponse bias
- Here is an interesting take on how to mitigate nonresponse bias.
- Voluntary response bias
- Random sampling can provide strong protection against voluntary response bias.
- Undercoverage
- This Harvard Business Review article by Kate Crawford illustrates this nicely using the example of Hurricane Sandy.
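To make undercoverage concrete, here is a minimal sketch with made-up numbers. The population, the `region` labels, and the `hours_without_power` values are all hypothetical and are not taken from the article above. It only shows how an estimate can shift when one group never makes it into the sample.

```python
from statistics import mean

# Hypothetical population: (region, hours_without_power) for each resident.
# These figures are invented for illustration only.
population = (
    [("urban", 2)] * 60     # 60 urban residents, 2 hours each
    + [("rural", 10)] * 40  # 40 rural residents, 10 hours each
)


def mean_hours(rows):
    """Average the hours value over a list of (region, hours) rows."""
    return mean(hours for _, hours in rows)


# Estimate from the full population.
true_mean = mean_hours(population)

# Estimate when only urban residents are covered by the data collection.
urban_only = [row for row in population if row[0] == "urban"]
covered_mean = mean_hours(urban_only)

print(f"All residents: {true_mean:.1f} hours")
print(f"Urban only:    {covered_mean:.1f} hours")
```

In this toy data the urban-only estimate understates the population average because rural residents, who have the larger values, are never sampled.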
Response Bias
- Leading Questions
- Social Desirability
- This post by the experience management company Qualtrics describes another example of response bias: “acquiescence response bias”.
Missing Variables
- Features that were not captured during data collection but that affect the analysis and the final recommendation
Survivorship Bias
- Brands that exist in the collection today are the survivors; brands that churned are missing from the data, which has implications for the analysis and its interpretation
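As a rough illustration of survivorship bias, the sketch below compares a metric computed over surviving brands with the same metric computed over all brands, including those that churned. The brand names, growth figures, and the `still_active` flag are hypothetical.

```python
from statistics import mean

# Hypothetical brands: (name, annual_growth_pct, still_active).
brands = [
    ("A", 12.0, True),
    ("B", 8.0, True),
    ("C", -20.0, False),  # churned, so absent from today's collection
    ("D", -15.0, False),  # churned, so absent from today's collection
    ("E", 5.0, True),
]

# What an analyst sees when only surviving brands are in the data.
survivors_growth = mean(growth for _, growth, active in brands if active)

# What the full history would show if churned brands were included.
all_growth = mean(growth for _, growth, _ in brands)

print(f"Survivors only: {survivors_growth:.2f}%")
print(f"All brands:     {all_growth:.2f}%")
```

With these invented numbers, the survivors-only average looks healthy while the average over all brands is negative, which is the kind of gap that can mislead an interpretation.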
Additional Resources on Overcoming Selection and Response Bias
- Simple random sampling is often cited as a way to address biases during data collection. Check out this blog describing some methods.
- If you are interested in academic papers, we also recommend reading the article "Addressing Selection Bias in Event Studies with General Purpose Social Media Panels" by Princeton faculty Han Zhang and Microsoft Researchers Shawndra Hill and David Rothschild.
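Simple random sampling, mentioned above, can be sketched in a few lines with Python's standard `random` module. The helper name `simple_random_sample`, the respondent IDs, and the seed are assumptions made for illustration; the contrast with a convenience sample (taking whoever comes first) is included only to show the difference in how units are chosen.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units, each subset of size n being equally likely.

    `seed` is optional and only makes the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical sampling frame: respondent IDs 1 through 100.
respondent_ids = list(range(1, 101))

# Simple random sample of 10 respondents.
srs = simple_random_sample(respondent_ids, 10, seed=42)

# Convenience sample: the first 10 respondents on the list.
convenience = respondent_ids[:10]

print("Simple random sample:", sorted(srs))
print("Convenience sample:  ", convenience)
```

Random selection only protects against bias in who is chosen from the frame. It does not help if the frame itself leaves people out, or if the chosen respondents decline to answer.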
Quiz 1
QUIZ QUESTION:
Match each selection bias example to its type.
ANSWER CHOICES:
| Selection Bias | Type |
|---|---|
| Asking a survey group their salaries as part of a feature dataset. | |
| Polling a college group about their political preferences as a feature set to represent the larger population’s opinion. | |
| Asking a group of older residents to take a 5 question survey via smartphone as part of a feature dataset. | |
| Phoning entrepreneurs to ask about their financial growth and primarily getting responses from companies that are growing. | |
| Running polls in urban areas as part of a feature dataset. | |
SOLUTION:
| Selection Bias | Type |
|---|---|
| Polling a college group about their political preferences as a feature set to represent the larger population’s opinion. | |
| Phoning entrepreneurs to ask about their financial growth and primarily getting responses from companies that are growing. | |
| Running polls in urban areas as part of a feature dataset. | |
| Asking a survey group their salaries as part of a feature dataset. | |
| Asking a group of older residents to take a 5 question survey via smartphone as part of a feature dataset. | |
Lesson 3 08 Q A Convenient Sample
Quiz 2
SOLUTION:
- Random sampling is a good way to reduce __response bias__.
- To guard against bias from undercoverage, use a __convenience sample__.
- To guard against __nonresponse bias__, use a mail-in survey.